May 24, 2019

Objectives

  • DFG project “The populist challenge in parliament” (2019-2021, in cooperation with Christian Stecker, Marcel Lewandowsky, Jochen Müller)

  • PolMine Project:
    • long history of experiments with cooccurrence graphs and graph visualisation
    • long history of difficulties in making sense of this analytical approach (derived from corpus linguistics) from the point of view of social science methodology
    • graph annotation widget (gradget) as a recent development: New workflow that combines qualitative and quantitative exploration of cooccurrence graphs
  • Substantive interest: Data-driven (but conceptually guided) reconstruction of AfD ideology.

  • Methodological interest:
    • Validity and intersubjectivity of data-driven, “distant reading” approaches (Moretti 2013)
    • Special focus: Interactive graph annotation as an approach to generate intersubjectively shared interpretations/understandings of discourse patterns.

Objectives and Theory

Mapping ideologies

  • analysing ideologies as relational analysis of concepts:
    “[…] ideologies are configurations of political concepts - such as liberty, democracy, justice, and nationhood - in which particular interpretations of each constituent concept have been selected out of an indeterminate range of meanings they may signify.” (Freeden 1998, 748)

  • Mudde’s definition of populism:
    “I define populism as an ideology that considers society to be ultimately separated into two homogeneous and antagonistic groups, ‘the pure people’ versus ‘the corrupt elite’, and which argues that politics should be an expression of the volonté générale (general will) of the people.” (Mudde 2004, 543)

  • referring to Freeden, Mudde sees populism as a “thin-centered ideology” (Mudde 2004, 544)

  • cognitive-affective mapping as a recent approach to analysing the relational nature of ideologies (Homer-Dixon et al. 2013; Homer-Dixon et al. 2014)

Mapping the ideology of AfD

  • “end of ideology” (Bell 1960), “catch-all party” (Kirchheimer, n.d.), but a persistent tradition of aligning parties with ideologies (such as conservatism, socialism, green political thought, liberalism)

  • perils of ascribing (thin) populist ideology to AfD:
    • AfD not to be classified as populist a priori
    • this is a matter of empirical research!
  • change within and beyond populism
    • possibility that AfD may enter a post-populist stage and turn into a conservative or nationalist party (parliamentary transformation of AfD)
    • even if AfD is populist: populism may be aligned with other ideologies, and there are varieties of populism
  • requirement to have a research instrument to empirically reconstruct the ideology of AfD, including the potential internal variation of ideology

  • Q: Persistence of populist traits of AfD in parliament, (variation of) ideological alignments of AfD.

Data

The MigParl Corpus

  • corpora of parliamentary debates as hallmark of the PolMine Project: GermaParl

  • The following analysis is based on the MigParl corpus:
    • The corpus has been prepared in the MigTex Project (“Textressourcen für die Migrations- und Integrationsforschung”, funding: BMFSFJ)
    • Preparation of all plenary debates in Germany’s regional parliaments (2000-2018) using the “Framework for Parsing Plenary Protocols” (frappp-package)
    • Extraction of a thematic subcorpus using unsupervised learning (topic modelling)
  • Size of the MigParl corpus:
    27,241,205 tokens

  • Size without interjections and presidency:
    22,837,376 tokens

  • MigParl corpus used as “proof of concept” - FedParl corpus (all debates in regional parliaments) to be used for the “real” / “full” analysis

MigParl by year

AfD in MigParl - tokens

AfD in MigParl - share

MigParl - regional dispersion

So what’s in the data?

  • (unsurprising) peak of debates on migration and integration affairs in 2015.

  • the total number of words spoken by AfD parliamentarians and their relative share have increased, as the AfD made it into an increasing number of regional parliaments.

  • AfD presence is stronger in the Eastern regional states, corresponding to stronger electoral results there.

Method

The PolMine Project R Packages

  • The PolMine Project is essentially about using corpora in social science research:
    • polmineR: basic vocabulary for corpus analysis
    • RcppCWB: wrapper for the Corpus Workbench (using C++/Rcpp, follow-up on rcqp-package)
    • cwbtools: tools to create and manage CWB indexed corpora
  • objectives of the polmineR package:
    • working efficiently with large, structurally and linguistically annotated corpora
    • offer a basic vocabulary for corpus analysis
    • serve as a toolset to integrate qualitative and quantitative approaches to data in a seamless workflow

Quantity & quality

Opening the blackbox of dictionary-based corpus analysis

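# `good`, `bad` (word vectors) and `sentiws` (columns "word", "weight") are assumed to exist
library(polmineR)
library(magrittr)  # provides the pipe operator %>%
library(knitr)     # provides knit_print()
kwic("GERMAPARL", query = "Islam", positivelist = c(good, bad)) %>%
  highlight(lightgreen = good, orange = bad) %>%
  tooltips(setNames(sentiws[["word"]], sentiws[["weight"]])) %>%
  knit_print()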

Outline of the workflow

  • data-driven identification of all significant cooccurrences

  • term extraction / calculating keywords as illustration / first analytical take on the data: conveys logic of using difference tests to bring out noteworthy features from a unit of analysis (subcorpus/word context)

  • To gain a first insight into the thematic foci and linguistic features of AfD speakers, we use the technique of term extraction (Baker 2006): get terms that occur more often in a corpus of interest (coi) compared to a reference corpus (ref) than would be expected by chance. The statistical test used is a chi-squared test (Kilgarriff and Rose 1998).

  • equivalent logic of a difference test to get cooccurrences (log-likelihood statistic) - but need to filter and to work with key cooccurrences (statistically significant cooccurrences)

Statistically significant words

  • The point of departure is a contingency table (coi = corpus of interest, ref = reference corpus):

                   coi          ref          TOTAL
    count token    \(o_{11}\)   \(o_{12}\)   \(r_{1}\)
    other tokens   \(o_{21}\)   \(o_{22}\)   \(r_{2}\)
    TOTAL          \(c_{1}\)    \(c_{2}\)    N


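  • The chi-squared statistic is calculated as follows (for a single word), where \(O_{ij}\) are the observed counts \(o_{ij}\) in the contingency table and \(E_{ij} = r_{i} c_{j} / N\) are the counts expected by chance:

\[ X^{2} = \sum{\frac{(O_{ij} - E_{ij})^2}{E_{ij}}}\]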
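A minimal sketch of this calculation in base R, for a single word. The counts and the function name `chisq_term` are hypothetical and serve illustration only; they are not taken from MigParl, and this is not necessarily how polmineR implements the test.

```r
# Hypothetical counts, for illustration only (not MigParl results)
o11 <- 120          # count of the token in the corpus of interest (coi)
o12 <- 300          # count of the token in the reference corpus (ref)
coi_size <- 100000  # total number of tokens in coi (c_1)
ref_size <- 1000000 # total number of tokens in ref (c_2)

# Pearson chi-squared statistic for the 2x2 contingency table of one word
chisq_term <- function(o11, o12, coi_size, ref_size) {
  # matrix() fills column-wise: first column = coi, second column = ref
  observed <- matrix(
    c(o11, coi_size - o11, o12, ref_size - o12),
    nrow = 2,
    dimnames = list(c("count token", "other tokens"), c("coi", "ref"))
  )
  # expected counts under independence: E_ij = r_i * c_j / N
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2 / expected)
}

x2 <- chisq_term(o11, o12, coi_size, ref_size)

# Assumed significance threshold: p < 0.001 with one degree of freedom
x2 > qchisq(0.999, df = 1)
```

The same value can be cross-checked against `chisq.test(observed, correct = FALSE)` from the stats package.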

Term extraction I

Term extraction II (ADJA - NN)

Term extraction III (NN-ART-NN)

Intermediate findings

  • assumed features of populism remain present after the AfD arrived in parliament:
    • vocabulary that indicates the critique of established parties and elites.
    • foreigners and asylum-seekers are an object of concern (using pejorative language)
  • will results be different once we use the full FedParl corpus?

Cooccurrences and cooccurrence graphs

  • the statistical/mathematical logic of calculating cooccurrences is identical to the one explained for term extraction, but the log-likelihood test is used:

\[ G^{2} = 2 \sum{O_{ij} \log\left(\frac{O_{ij}}{E_{ij}}\right)} \]

  • calculation of all statistically significant cooccurrences in a corpus (effects of corpus size!) - available in polmineR starting with v0.7.9.11

  • Cooccurrence graphs are a popular eye-catcher (Scharloth, Eugster, and Bubenhofer 2013; Lemke and Wiedemann 2016).

  • The visualisations are very suggestive and seem to be a great condensation of ideas we have about discourse. But are these interpretations sound and do they meet standards of intersubjectivity?
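As a minimal sketch of the log-likelihood statistic defined on this slide, the following base R code computes \(G^{2}\) for one candidate cooccurrence. The counts and the function name `log_likelihood` are hypothetical; the sketch illustrates the formula and makes no claim about polmineR's internal implementation.

```r
# Log-likelihood statistic G^2 for a contingency table of observed counts
log_likelihood <- function(observed) {
  # expected counts under independence: E_ij = r_i * c_j / N
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  # cells with an observed count of zero contribute 0 to the sum
  terms <- ifelse(observed > 0, observed * log(observed / expected), 0)
  2 * sum(terms)
}

# Hypothetical counts: how often a collocate occurs inside the word
# context (window) of a node word, and how often in the rest of the corpus
observed <- matrix(
  c(30, 970, 200, 998800),
  nrow = 2,
  dimnames = list(c("collocate", "other tokens"), c("window", "rest of corpus"))
)

g2 <- log_likelihood(observed)

# Assumed significance threshold: p < 0.001 with one degree of freedom
g2 > qchisq(0.999, df = 1)
```

With very large corpora, even small deviations from the expected counts pass such a threshold, which is one reason why further filtering is needed.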

Filtering and annotating cooccurrences

  • objective: get the significant cooccurrences of the AfD in parliamentary discourse: We are not just interested in all statistically significant cooccurrences, but more specifically in those that distinguish AfD speech-making from speeches made by parliamentarians of other factions.

  • second difference test (chi-squared statistic) with cooccurrences in speeches by all other parliamentarians. See the code for these slides to learn how this is implemented in polmineR.

  • Cf. the CorporaCoCo R package (Hennessey et al. 2017), see also this research note on co-occurrence comparison techniques.

  • Towards intersubjectivity: three-dimensional, interactive graph visualisations that can be annotated (called gradgets, for graph annotation widgets).
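The second difference test described on this slide can be sketched in base R as follows. This is an illustration under stated assumptions, not the implementation used for these slides: the data frame `cooc`, its placeholder tokens (`w1` to `w6`), all counts, the totals and the helper `chisq_2x2` are hypothetical.

```r
# Hypothetical cooccurrence counts (illustration only, not MigParl results)
cooc <- data.frame(
  a = c("w1", "w2", "w3"),           # placeholder node tokens
  b = c("w4", "w5", "w6"),           # placeholder collocates
  count_afd = c(40, 5, 12),          # cooccurrence counts in AfD speeches
  count_other = c(60, 50, 400),      # counts in speeches of all other speakers
  stringsAsFactors = FALSE
)
total_afd <- 10000    # assumed total of cooccurrence counts, AfD speeches
total_other <- 100000 # assumed total of cooccurrence counts, other speakers

# Pearson chi-squared statistic for one cooccurrence (2x2 table)
chisq_2x2 <- function(o11, o12, c1, c2) {
  observed <- matrix(c(o11, c1 - o11, o12, c2 - o12), nrow = 2)
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2 / expected)
}

cooc$chisquare <- mapply(
  chisq_2x2, cooc$count_afd, cooc$count_other,
  MoreArgs = list(c1 = total_afd, c2 = total_other)
)

# The test is two-sided, so keep only cooccurrences that are relatively
# more frequent in AfD speeches than in speeches of other speakers
cooc$overused <- cooc$count_afd / total_afd > cooc$count_other / total_other

# Assumed threshold: p < 0.001 with one degree of freedom
key_cooc <- cooc[cooc$chisquare >= qchisq(0.999, df = 1) & cooc$overused, ]
```

In this toy data, only the first pair is both significant and overrepresented in AfD speeches; the other two are either evenly distributed or more typical of other speakers.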

Mapping AfD

AfD Cooccurrences (unfiltered)

Graph visualisation (2D, N = 100)

Graph visualisation (2D, N = 250)

Graph visualisation (2D, N = 400)

Where we stand

  • The graph layout depends heavily on filter decisions.

  • Filtering is necessary, but filter decisions are difficult to justify.

  • Graph visualisation implies many possibilities to provide extra information, but there are perils of information overload.

  • If we try to avoid filter decisions, we run into the problem of the overwhelming complexity of large graphs.

  • How to handle the complexity and create the foundations for intersubjectivity?

Graph visualisation (3D)

Conclusions

Intermediate conclusions

The results of this research are very preliminary:

  • (somewhat surprising) explicit politeness of AfD speakers.

  • It’s the economy: Introducing a redistributive logic as a leitmotiv.

  • AfD speakers are far from self-absorbed: there is a lot of interaction with other parties (and visitors!).

  • Cultivating antagonisms: “Wir” (AfD / AfD-Fraktion) and the others.

But in a way, AfD speeches served only as a case for developing the idea of “visual hermeneutics” (Schaal, Kath, and Dumm 2016): If we decide to work with cooccurrence graphs, graph annotation is the approach suggested here to combine distant and close reading, and to achieve intersubjectivity.

Potential output (Homer-Dixon et al. 2013)

Homer-Dixon et al. 2013

Next steps

  • rethink and adjust filter decisions (restricting the analysis to key cooccurrences may throw out ideologically significant vocabulary)

  • finally use FedParl corpus!
    • compare regional states
    • compare “Flügel” MPs and other MPs
  • really do the annotation - and find a way to publish three-dimensional visualisations

Feedback welcome!

References

Baker, Paul. 2006. Using Corpora in Discourse Analysis. London: Continuum.

Bell, Daniel. 1960. The End of Ideology. New York: Collier.

Freeden, Michael. 1998. “Is Nationalism a Distinct Ideology?” Political Studies 46: 748–65.

Hennessey, Anthony, Viola Wiegand, Michaela Mahlberg, Christopher R. Tench, and Jamie Lentin. 2017. CorporaCoCo: Corpora Co-Occurrence Comparison. https://CRAN.R-project.org/package=CorporaCoCo.

Kirchheimer, Otto. n.d. “Der Wandel Des Westdeutschen Parteisystems.” Politische Vierteljahresschrift 6: 20–41.

Lemke, Matthias, and Gregor Wiedemann. 2016. Text Mining in Den Sozialwissenschaften: Grundlagen Und Anwendungen Zwischen Qualitativer Und Quantitativer Diskursanalyse. Wiesbaden: Springer.

Moretti, Franco. 2013. Distant Reading. London: Verso.

Mudde, Cas. 2004. “The Populist Zeitgeist.” Government and Opposition 39 (4): 541–63.

Kilgarriff, Adam, and Tony Rose. 1998. “Measures for Corpus Similarity and Homogeneity.” In Proceedings of the 3rd Conference on Empirical Methods in Natural Language Processing, 46–52. ACL-SIGDAT.

Schaal, G.S., R. Kath, and S. Dumm. 2016. “New Visual Hermeneutics.” Cybernetics & Human Knowing 23 (2): 51–75.

Scharloth, Joachim, David Eugster, and Noah Bubenhofer. 2013. “Das Wuchern Der Rhizome. Linguistische Diskursanalyse Und Data-Driven Turn.” In Linguistische Diskursanalyse. Neue Perspektiven, edited by Dietrich Busse and Wolfgang Teubert, 345–80. Wiesbaden: VS Verlag.

Homer-Dixon, Thomas, Matto Mildenberger, Jonathan Leader Maynard, et al. 2013. “A Complex Systems Approach to the Study of Ideology: Cognitive-Affective Structures and the Dynamics of Belief Systems.” Journal of Social and Political Psychology 1: 337–63.

Homer-Dixon, Thomas, Steven J. Mock, Manjana Milkoreit, and Paul Thagard. 2014. “The Conceptual Structure of Social Disputes: Cognitive-Affective Maps as a Tool for Conflict Analysis and Resolution.” SAGE Open, 1–20.